Does Google Scholar contain all highly cited documents (1950-2013)?

Authors

  • Alberto Martín-Martín
  • Enrique Orduña-Malea
  • Juan Manuel Ayllon
  • Emilio Delgado López-Cózar
Abstract

The study of highly cited documents on Google Scholar (GS) has not yet been addressed in a comprehensive manner. The objective of this work is to identify the set of highly cited documents in Google Scholar and define their core characteristics: their languages, their file formats, and how many of them can be accessed free of charge. We also try to answer some additional questions that may shed light on the use of GS as a tool for assessing scientific impact through citations. The ten research questions are:

  1. Which are the most cited documents in GS?
  2. Which are the most cited document types in GS?
  3. In what languages are the most cited documents in GS written?
  4. How many highly cited documents are freely accessible?
     a. What file types are most commonly used to store these highly cited documents?
     b. Which are the main providers of these documents?
  5. How many of the highly cited documents indexed by GS are also indexed by WoS?
  6. Is there a correlation between the number of citations that these highly cited documents have received in GS and the number of citations they have received in WoS?
  7. How many versions of these highly cited documents has GS detected?
  8. Is there a correlation between the number of versions GS has detected for these documents and the number of citations they have received?
  9. Is there a correlation between the number of versions GS has detected for these documents and their position in the search engine result pages?
  10. Is there a relation between the positions these documents occupy in the search engine result pages and the number of citations they have received?

To answer these questions, a set of 64,000 documents indexed in Google Scholar was collected by performing 64 queries, one per year from 1950 to 2013, using Google Scholar's advanced search, and collecting the maximum number of records that GS displays for any given query, which is always 1,000.
These 64,000 documents have received 122,245,865 citations in Google Scholar and 35,182,077 in the Web of Science Core Collection. The full raw data are available at: http://dx.doi.org/10.6084/m9.figshare.1224314
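The sample size and citation totals quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, and the per-document means assume that each total is spread over all 64,000 documents, including any that are not indexed in WoS.

```python
# Figures quoted in the abstract; all names below are illustrative.
FIRST_YEAR, LAST_YEAR = 1950, 2013
MAX_RECORDS_PER_QUERY = 1_000   # maximum GS displays for a single query
GS_CITATIONS = 122_245_865      # total citations in Google Scholar
WOS_CITATIONS = 35_182_077      # total citations in WoS Core Collection


def sample_summary():
    """Derive sample size and simple citation ratios from the quoted totals."""
    n_queries = LAST_YEAR - FIRST_YEAR + 1          # one query per year
    n_docs = n_queries * MAX_RECORDS_PER_QUERY
    return {
        "queries": n_queries,
        "documents": n_docs,
        # Assumption: totals are averaged over the whole sample.
        "mean_gs_citations": GS_CITATIONS / n_docs,
        "mean_wos_citations": WOS_CITATIONS / n_docs,
        "gs_to_wos_ratio": GS_CITATIONS / WOS_CITATIONS,
    }


if __name__ == "__main__":
    for name, value in sample_summary().items():
        print(f"{name}: {value:,.2f}")
```

These are aggregate ratios only; they say nothing about the per-document correlation asked about in question 6, which would require the raw data set.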

Similar resources

A two-sided academic landscape: portrait of highly-cited documents in Google Scholar (1950-2013)

The main objective of this paper is to identify the set of highly-cited documents in Google Scholar and to define their core characteristics (document types, language, free availability, source providers, and number of versions), under the hypothesis that the wide coverage of this search engine may provide a different portrait of this document set with respect to that offered by the traditional b...


Does it Matter Which Citation Tool is Used to Compare the h-index of a Group of Highly Cited Researchers?

The h-index retrieved from citation indexes (Scopus, Google Scholar, and Web of Science) is used to measure a scientist's performance and research impact based on the number of publications and citations. It is also easily available and may be used for performance measures of scientists and for recruitment decisions. The aim of this study is to investigate the difference ...


CIDS country rankings: comparing documents and citations of USA, UK and China top researchers

This technical report presents a bibliometric analysis of the top 30 cited researchers from USA, UK and China. The analysis is based on Google Scholar data using CIDS. The researchers were identified using their email suffix: edu, uk and cn. This naïve approach was able to produce rankings consistent with the SCImago country rankings using minimal resources in a fully automated way.


Free articles and Accounting for the timing effect

Various studies have attempted to assess the amount of free full text available on the web, and recent work has suggested that we are close to the 50% mark for freely available articles (Archambault et al. 2013; Björk et al. 2010; Jamali and Nabavi 2015). Our paper contributes to the literature by taking into account the timing issue, studying when the papers were made free. We sampled citati...


Comparison and Analysis of the Citedness Scores in Web of Science and Google Scholar

An increasing number of online information services calculate and report the citedness score of the source documents and provide a link to the group of records of the citing documents. The citedness score depends on the breadth of source coverage, and the ability of the software to identify the cited documents correctly. The citedness score may be a good indicator of the influence of the docume...


Journal title:
  • CoRR

Volume abs/1410.8464

Publication date 2014